Brian S. Evans, Ph.D.
Migratory Bird Center
Smithsonian Conservation Biology Institute
# Load RCurl library:
library(RCurl)
# Load a source script:
script <-
getURL(
"https://raw.githubusercontent.com/bsevansunc/workshop_languageOfR/master/sourceCode.R"
)
# Evaluate then remove the source script:
eval(parse(text = script))
rm(script)
Why would you use for loops?
# Filter irisTbl to setosa:
irisTbl[irisTbl$species == 'setosa', ]
# Extract the petalLength field (column):
irisTbl[irisTbl$species == 'setosa', ]$petalLength
# Calculate the mean of petal lengths:
mean(irisTbl[irisTbl$species == 'setosa', ]$petalLength)
Calculate the mean petal length of each of the Iris species using matrix notation (as above) and a custom function.
# Mean petal lengths, matrix notation:
mean(irisTbl[irisTbl$species == 'setosa', ]$petalLength)
mean(irisTbl[irisTbl$species == 'versicolor', ]$petalLength)
mean(irisTbl[irisTbl$species == 'virginica', ]$petalLength)
# Mean petal lengths, function method:
meanPetalFun <- function(spp){
mean(irisTbl[irisTbl$species == spp, ]$petalLength)
}
meanPetalFun('setosa')
meanPetalFun('versicolor')
meanPetalFun('virginica')
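The function approach can be tried without the workshop's source script. The sketch below assumes irisTbl has the columns species and petalLength (as used above) and builds a stand-in from R's built-in iris data. The stand-in construction is an assumption, not the workshop's own code.

```r
# Stand-in for irisTbl (assumption: the real object comes from sourceCode.R).
# Column names follow the code above; values come from the built-in iris data.
irisTbl <- data.frame(
  species = iris$Species,
  petalLength = iris$Petal.Length
)

# Same function as above: subset rows by species, then average one column.
meanPetalFun <- function(spp){
  mean(irisTbl[irisTbl$species == spp, ]$petalLength)
}

# One call per species; the function body is written only once.
meanPetalFun('setosa')
meanPetalFun('versicolor')
meanPetalFun('virginica')
```

The function removes the repeated subsetting code, but the calls themselves are still repeated once per species.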
Consider the following numeric vector, v:
| [1] | [2] | [3] | [4] | [5] |
|---|---|---|---|---|
| 1 | 1 | 2 | 3 | 5 |
Vector v is an R object composed of five numbers.
# Explore vector v:
v
class(v)
str(v)
length(v)
Each value in a vector has a position, denoted by “[i]”.
Recall: v[i] is the value of v at position i.
# Explore vector v using indexing:
i <- 3
v[i]
v[3]
v[3] == v[i]
# Add 1 to the value of v at position three:
i <- 3
v[3] + 1
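v[i] + 1
Writing a proper for loop requires three steps: define an object to store the output, define the sequence to iterate across, and write the body that runs at each iteration.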
ALWAYS specify an object to store your output!
Vector objects are defined as:
# Define a vector for output:
vNew <- vector('numeric', length = length(v))
str(vNew)
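As a small illustration of what vector() returns, the sketch below pre-sizes output objects of three modes. The object names (numOut, chrOut, listOut) are hypothetical and chosen only for this example.

```r
# vector(mode, length) returns a vector filled with the mode's "empty" value.
numOut  <- vector('numeric', length = 3)    # zeros
chrOut  <- vector('character', length = 3)  # empty strings
listOut <- vector('list', length = 3)       # NULL elements

str(numOut)
str(chrOut)
str(listOut)

# Filling by index does not change the length of a pre-sized vector:
numOut[2] <- 42
length(numOut)
```

Choosing the mode up front matters later: a numeric vector holds one number per iteration, while a list can hold a whole data frame per iteration.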
# Explore filling values of vNew by index:
i <- 3
v[i]
vNew[i] <- v[i] + 1
vNew[i]
v[i] + 1 == vNew[i]
The sequence can be defined using:
v
1:5
1:length(v)
seq_along(v)
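The options above behave the same for a vector of length five, but 1:length(v) and seq_along(v) differ when the vector is empty. A minimal sketch, using a hypothetical zero-length vector named emptyV:

```r
# A zero-length vector, e.g. the result of a filter that matched nothing:
emptyV <- numeric(0)

# 1:length(emptyV) is 1:0, which counts DOWN and yields two positions:
1:length(emptyV)

# seq_along(emptyV) yields no positions, so a loop body never runs:
seq_along(emptyV)

# Count iterations under each sequence statement:
nColon <- 0
for(i in 1:length(emptyV)) nColon <- nColon + 1
nAlong <- 0
for(i in seq_along(emptyV)) nAlong <- nAlong + 1
c(colon = nColon, seq_along = nAlong)
```

This is one reason seq_along() is the safer default for a sequence statement.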
# Example for loop sequence statements:
# for(i in 1:length(v))
# for(i in seq_along(v))
The for loop body describes what will happen at each iteration of the loop. For example:
i <- 3
vNew[i] <- v[i] + 1
# For loop output:
vNew <- vector('numeric',length = length(v))
# For loop sequence:
for(i in seq_along(v)){
# For loop body:
vNew[i] <- v[i] + 1
}
# Explore first for loop output:
vNew
vNew == v + 1
Write a function that calculates m*x + b, with the arguments m, b, and x.
Assign the name x to the vector object 1:10.
Use a for loop to calculate y where: m = 0.5, b = 1.0, and x refers to the vector x above (Note: A for loop is not really required here).
linearModel <- function(m, x, b){
m*x+b
}
x <- 1:10
y <- vector('numeric',length = length(x))
for(i in seq_along(x)){
y[i] <- linearModel(m = 0.5, b = 1.0, x = x[i])
}
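The note that a for loop is not really required can be made concrete. Because arithmetic in R is vectorized, the function can take the whole vector x in one call. A minimal sketch (the name yVectorized is hypothetical):

```r
linearModel <- function(m, x, b){
  m*x+b
}
x <- 1:10

# For loop version, as above:
y <- vector('numeric', length = length(x))
for(i in seq_along(x)){
  y[i] <- linearModel(m = 0.5, b = 1.0, x = x[i])
}

# Vectorized version: pass the whole vector x in a single call.
yVectorized <- linearModel(m = 0.5, b = 1.0, x = x)

# Both approaches give the same values:
all.equal(y, yVectorized)
```

The loop is still a useful exercise here because it has the same output, sequence, and body structure as loops that cannot be vectorized so easily.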
# Mean petal lengths of Iris species without a for loop:
mean(irisTbl[irisTbl$species == 'setosa', ]$petalLength)
mean(irisTbl[irisTbl$species == 'versicolor', ]$petalLength)
mean(irisTbl[irisTbl$species == 'virginica', ]$petalLength)
Split-Apply-Combine
Start by creating a vector of species:
# Make a vector of species to loop across:
irisSpecies <- levels(irisTbl$species)
irisSpecies
Create an empty vector to store our output:
# For loop output statement:
petalLengths <- vector('numeric',length = length(irisSpecies))
petalLengths
Split: The for loop body starts by splitting the data:
# Exploring the iris data, subsetting by species:
i <- 3
irisSpecies[i]
irisTbl[irisTbl$species == irisSpecies[i], ]
# Split:
iris_sppSubset <- irisTbl[irisTbl$species == irisSpecies[i], ]
Apply: Modification of the data:
# Calculate mean petal length of each subset (apply):
mean(iris_sppSubset$petalLength)
# Make a vector of species to loop across:
irisSpecies <- levels(irisTbl$species)
# For loop output statement:
petalLengths <- vector('numeric',length = length(irisSpecies))
# For loop:
for(i in seq_along(irisSpecies)){
# Split:
iris_sppSubset <- irisTbl[irisTbl$species == irisSpecies[i], ]
# Apply:
petalLengths[i] <- mean(iris_sppSubset$petalLength)
}
Combine: Combining the for loop output
# Make a tibble data frame of the for loop output (combine):
petalLengthFrame <-
data_frame(species = irisSpecies, meanPetalLength = petalLengths)
petalLengthFrame
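The three stages can be run end to end without the workshop's source script. The sketch below is an assumption-laden stand-in: irisTbl is rebuilt from R's built-in iris data, base data.frame() is used in place of data_frame(), and the output column name meanPetalLength is my own choice.

```r
# Stand-in for irisTbl (assumption; the real object comes from sourceCode.R):
irisTbl <- data.frame(species = iris$Species, petalLength = iris$Petal.Length)

# Output and sequence:
irisSpecies <- levels(irisTbl$species)
petalLengths <- vector('numeric', length = length(irisSpecies))

for(i in seq_along(irisSpecies)){
  # Split: keep only the rows for species i.
  iris_sppSubset <- irisTbl[irisTbl$species == irisSpecies[i], ]
  # Apply: summarize the subset to a single number.
  petalLengths[i] <- mean(iris_sppSubset$petalLength)
}

# Combine: one row per species (base data.frame used here):
petalLengthFrame <- data.frame(
  species = irisSpecies,
  meanPetalLength = petalLengths
)
petalLengthFrame
```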
Use a for loop and the birdHabits data frame to calculate the number of species in each diet guild.
birdHabits
diets <- unique(birdHabits$diet)
outVector <- vector('numeric', length = length(diets))
for(i in seq_along(outVector)){
# Split:
dietSubset <- birdHabits[birdHabits$diet == diets[i],]
# Apply:
outVector[i] <- nrow(dietSubset)
}
# Combine:
data_frame(diet = diets, nSpecies = outVector)
For loops can be used to explore data objects with common features.
How many omnivorous birds were observed at each site?
# Explore the bird count data:
head(birdCounts)
str(birdCounts)
# Explore the bird trait data:
head(birdHabits)
str(birdHabits)
Example, site == 'apple'
# Extract vector of omnivorous species:
omnivores <- birdHabits[birdHabits$diet == 'omnivore',]$species
# Subset the counts to omnivores:
birdCounts[birdCounts$species %in% omnivores, ]$count
# Calculate the sum of counts:
sum(birdCounts[birdCounts$species %in% omnivores, ]$count)
# Subset the omnivore counts to site apple:
birdCounts[birdCounts$species %in% omnivores &
birdCounts$site == 'apple', ]
# Extract the count column:
birdCounts[birdCounts$species %in% omnivores &
birdCounts$site == 'apple', ]$count
# Calculate the sum:
sum(birdCounts[birdCounts$species %in% omnivores &
birdCounts$site == 'apple', ]$count)
Using the birdHabits and birdCounts data frames, modify the function below such that it will calculate the number of species of a given guild at a selected site.
richnessSiteGuild <- function(site, guild){
guildSpp <- birdHabits[birdHabits$foraging # COMPLETE
countSppSubset <- birdCounts[birdCounts$ # COMPLETE
countSppSiteSubset <- countSppSubset[# COMPLETE
nSpp <- # COMPLETE
return(nSpp)
}
richnessSiteGuild('apple', 'ground')
richnessSiteGuild <- function(site, guild){
guildSpp <- birdHabits[birdHabits$foraging == guild,]$species
countSppSubset <- birdCounts[birdCounts$species %in% guildSpp,]
countSppSiteSubset <- countSppSubset[countSppSubset$site == site,]
nSpp <- length(unique(countSppSiteSubset$species))
return(nSpp)
}
richnessSiteGuild('apple', 'ground')
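To see why the function uses length(unique(...)) rather than nrow(), it helps to run it on a tiny data set where one species is recorded twice at a site. The data below are hypothetical stand-ins: the column names follow the code above, but the species codes, the site 'pear', and all values are invented for illustration.

```r
# Hypothetical stand-ins for the workshop data (values invented):
birdHabits <- data.frame(
  species  = c('amro', 'grca', 'cach', 'noca'),
  foraging = c('ground', 'foliage', 'foliage', 'ground'),
  stringsAsFactors = FALSE
)
birdCounts <- data.frame(
  site    = c('apple', 'apple', 'apple', 'pear', 'pear'),
  species = c('amro', 'amro', 'grca', 'noca', 'amro'),
  count   = c(2, 1, 4, 3, 1),
  stringsAsFactors = FALSE
)

richnessSiteGuild <- function(site, guild){
  # Species that belong to the guild:
  guildSpp <- birdHabits[birdHabits$foraging == guild,]$species
  # Count records for those species, then for the chosen site:
  countSppSubset <- birdCounts[birdCounts$species %in% guildSpp,]
  countSppSiteSubset <- countSppSubset[countSppSubset$site == site,]
  # Number of distinct species, not number of records:
  nSpp <- length(unique(countSppSiteSubset$species))
  return(nSpp)
}

richnessSiteGuild('apple', 'ground')  # 'amro' has two records but counts once
richnessSiteGuild('pear', 'ground')
```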
Get a vector of omnivorous species from the birdHabits data frame:
# Extract vector of omnivorous species:
omnivores <- birdHabits[birdHabits$diet == 'omnivore',]$species
Split the data into individual sites.
# Generate a vector of unique sites:
sites <- unique(birdCounts$site)
# Site at position i:
i <- 3
sites[i]
# Subset data:
birdCounts_siteSubset <- birdCounts[birdCounts$site == sites[i],]
birdCounts_siteSubset
Split: Use %in% to extract only the count records associated with omnivores.
# Just a vector of omnivore counts:
countVector <-
birdCounts_siteSubset[birdCounts_siteSubset$species %in%
omnivores,]$count
Apply: Sum the count vector.
# Get total number of omnivores at the site:
nOmnivores <- sum(countVector)
Combine: Values combined using the vector method
sites <- unique(birdCounts$site)
outVector <- vector('numeric', length = length(sites))
for(i in seq_along(sites)){
birdCounts_siteSubset <- birdCounts[birdCounts$site == sites[i],]
countVector <-
birdCounts_siteSubset[birdCounts_siteSubset$species %in%
omnivores, ]$count
outVector[i] <- sum(countVector)
}
# Combine:
data_frame(site = sites, nOmnivores = outVector)
Combine: Values combined using the list method
sites <- unique(birdCounts$site)
outList <- vector('list', length = length(sites))
for(i in seq_along(sites)){
birdCounts_siteSubset <- birdCounts[birdCounts$site == sites[i],]
countVector <-
birdCounts_siteSubset[birdCounts_siteSubset$species %in%
omnivores,]$count
outList[[i]] <- data_frame(
site = sites[i],
nOmnivores = sum(countVector))
}
# Combine:
bind_rows(outList)
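The list method can be tried on a small data set without loading any packages. In the sketch below the data are hypothetical (invented species codes, sites, and counts), and base R's data.frame() and do.call(rbind, ...) are swapped in for data_frame() and bind_rows(). The loop structure is otherwise the same as above.

```r
# Hypothetical stand-ins for the workshop data (values invented):
birdHabits <- data.frame(
  species = c('amro', 'grca', 'noca'),
  diet    = c('omnivore', 'insectivore', 'omnivore'),
  stringsAsFactors = FALSE
)
birdCounts <- data.frame(
  site    = c('apple', 'apple', 'pear', 'pear', 'pear'),
  species = c('amro', 'grca', 'amro', 'noca', 'grca'),
  count   = c(2, 4, 1, 3, 5),
  stringsAsFactors = FALSE
)

omnivores <- birdHabits[birdHabits$diet == 'omnivore',]$species
sites <- unique(birdCounts$site)

# Output: a list with one (initially NULL) slot per site.
outList <- vector('list', length = length(sites))
for(i in seq_along(sites)){
  # Split: records for site i, then omnivore counts only.
  birdCounts_siteSubset <- birdCounts[birdCounts$site == sites[i],]
  countVector <-
    birdCounts_siteSubset[birdCounts_siteSubset$species %in% omnivores,]$count
  # Apply: store a one-row data frame in slot i.
  outList[[i]] <- data.frame(site = sites[i], nOmnivores = sum(countVector))
}

# Combine: stack the one-row data frames (base R stand-in for bind_rows):
omnivoreFrame <- do.call(rbind, outList)
omnivoreFrame
```

Storing a one-row data frame per iteration keeps each site's label next to its value, which a plain numeric vector cannot do.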
Using the richnessSiteGuild function you created in Exercise Four and the birdHabits and birdCounts data frames, modify the for loop code below to count the number of species that are ground foragers at each site.
sites <- unique(# COMPLETE
outList <- vector('list', length = # COMPLETE
for(i in # COMPLETE
outList[[i]] <- data_frame(site = sites[i],
# COMPLETE
}
bind_rows(# COMPLETE
Using the richnessSiteGuild function you created in Exercise Four and the birdHabits and birdCounts data frames, write a for loop that will count the number of observed species that are ground foragers at each site.
sites <- unique(birdCounts$site)
outList <- vector('list', length = length(sites))
for(i in seq_along(sites)) {
outList[[i]] <- data_frame(site = sites[i],
nSpecies = richnessSiteGuild(sites[i], 'ground'))
}
bind_rows(outList)
For loops can generate a vector of numbers based on a mathematical function. For example:
\[n_t = 2(n_{t-1})\]
# For loop output:
n <- vector('numeric', length = 5)
n
# Set the seed value:
n[1] <- 10
n
# For loop sequence:
# for(i in 2:length(n))
Body: For each iteration (example, position 2):
# Exploring the construction of the for loop body:
i <- 2
n[i]
n[i-1]
n[i] <- 2*n[i-1]
n
# Output:
n <- vector('numeric', length = 5)
# Seed:
n[1] <- 10
# For loop:
for(i in 2:length(n)){
n[i] <- 2*n[i-1]
}
One of my favorite for loops was created by Leonardo Bonacci (Fibonacci). He created the first known population model, which gave rise to the famous Fibonacci number series. He described a population (N) of rabbits at time t as the sum of the population at the previous time step plus the population at the time step before that:
\[N_t = N_{t-1} + N_{t-2}\]
fibOut <- vector('numeric', length = 20)
fibOut[1:2] <- c(0,1)
for(i in 3:length(fibOut)){
fibOut[i] <- fibOut[i-2] + fibOut[i-1]
}
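A quick way to confirm that the loop implements the recurrence is to test every position from the third onward against the two values before it. The sketch below repeats the loop so it runs on its own; the index vector name tIdx is hypothetical.

```r
# Output, with the first two values set as seeds:
fibOut <- vector('numeric', length = 20)
fibOut[1:2] <- c(0, 1)

# The sequence starts at 3 because positions 1 and 2 are already filled:
for(i in 3:length(fibOut)){
  fibOut[i] <- fibOut[i-2] + fibOut[i-1]
}

# First eight values of the series:
fibOut[1:8]

# Check N_t = N_{t-1} + N_{t-2} at every position from 3 onward:
tIdx <- 3:length(fibOut)
all(fibOut[tIdx] == fibOut[tIdx - 1] + fibOut[tIdx - 2])
```

Note that the sequence 3:length(fibOut) assumes the output vector has at least three positions; with a shorter vector the sequence would count downward.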